Finding information in books: characteristics of full-text searches in a collection of 10 million books

Authors

  • Craig Willis
  • Miles Efron
Abstract

Searching large collections of digitized books is a relatively new area in information-seeking and retrieval research, made possible by initiatives such as Google Books and the HathiTrust Digital Library. The availability of large full-text book collections is transforming how users search and interact with information in books, but the characteristics of these changes are unknown. This paper aims to provide insight into the characteristics of full-text searches in a large collection of digitized books and is the first step in a broader research agenda intended to improve book retrieval. To better understand the types of queries that users are issuing to full-text book collections, we analyzed a full year of anonymized query logs from the HathiTrust Digital Library full-text search engine. We also manually classified a random sample of 600 queries to develop a taxonomy of book search query types. We found that users are beginning to search for information in books instead of searching for books. Searches still largely follow bibliographic models, but, as expected, new types of searches are beginning to take advantage of full-text capabilities. Additionally, comparing the results of our query log analysis to searches in other domains, we found similar search patterns, including short queries, sessions with only a few queries, and users viewing only a few pages of results per query. We discuss how these findings can be used to characterize users of large full-text book collections.

Similar resources

A Comparison of Relationship between Text and Picture in the Selected Iranian and Contemporary American-European Illustrated-Fiction Books Based on the Theory of Maria Nikolajeva and Carole Scott

Illustrated-fiction books are special forms of art that are the combination of text and picture. The relationship between text and picture in this genre is diverse and variegated, and has different effects on the audience; however, little research has been done about it. The goal of this research is to compare text/picture relationship in the selected Iranian and contemporary American-European ...

Europe PMC: Quick tour

What is Europe PMC? Europe PMC [2] is a global, free, biomedical literature repository, providing access to worldwide life sciences articles, books, patents and clinical guidelines. The resource currently contains over 32 million abstracts and more than 4 million full-text articles (see Figure 1). A subset of the full-text information corpus is the open-access literature that can be downloaded ...

A Decade of Book Publishing in Knowledge and Information Science in Iran (1381-1390)

Introduction: To identify the status of book publishing in Knowledge and Information Science in Iran during 2002-2011. Methods: This study is applied research in terms of its aim. The research methodology is an analytical survey. Data were collected through a checklist. The population consists of 632 books in the field of Knowledge and Information Science published during 2002-2011. The ...

Characteristics of Arabic Identity in Intellectual System of Hisham Kalbi based on his Books on Genealogy

Science of "Genealogy" was one of the branches of History and Historiography during the age of Jāhilīyah (age of ignorance) which has grown rapidly in the Islamic era. In this context, Hisham Kalbi (d. 204 AH. / 819 AD.), as the first author and editor of Genealogy, has a great contribution to the formation and prosperity of this science, with two important texts, the Jamharat Al-Ansab and Nasa...

Surveying the Experts View on the Necessity of Revision in Rating of Children and Adolescents's Books

Background and Aim: The purpose of this study was to determine the current status of non-academic rankings of children's books and to survey experts' views on the revision scheme in the classification of such books. Method: A qualitative study was employed. The research tool was a questionnaire based on the research objectives. An open-ended interview data collection method was used based...

Publication date: 2013